An experimental comparison of multiple vocoder types

Authors

Qiong Hu

Korin Richmond

Junichi Yamagishi

Javier Latorre

Abstract

This paper presents an experimental comparison of a broad range of the leading vocoder types which have been previously described. We use a reference implementation of each of these to create stimuli for a listening test using copy synthesis. The listening test is performed using both Lombard and normal read speech stimuli, and with two types of question for comparison. Multi-dimensional Scaling (MDS) is conducted on the listener responses to analyse similarities in terms of quality between the vocoders. Our MDS and clustering results show that the vocoders which use a sinusoidal synthesis approach are perceptually distinguishable from the source-filter vocoders. To help further interpret the axes of the resulting MDS space, we test for correlations with standard acoustic quality metrics and find one axis is strongly correlated with PESQ scores. We also find both speech style and the format of the listening test question may influence test results. Finally, we also present preference test results which compare each vocoder with the natural speech.
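The correlation check mentioned in the abstract (comparing an MDS axis against an objective metric such as PESQ) can be illustrated with a minimal sketch. The abstract does not state which correlation statistic the authors used, so Pearson's r is an assumption here, and the names `pearson_r`, `axis1` and `pesq` as well as all numeric values are hypothetical and are not results from the paper.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred cross-products and centred squares
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one coordinate per vocoder along a single MDS axis,
# and a made-up PESQ score for the same vocoders (assumed, not from the paper).
axis1 = [-1.2, -0.8, -0.1, 0.4, 0.9, 1.3]
pesq = [1.9, 2.3, 2.6, 3.1, 3.4, 3.9]

r = pearson_r(axis1, pesq)
print(f"Pearson r between MDS axis 1 and PESQ: {r:.3f}")
```

A value of r close to +1 or -1 would suggest that the axis tracks the metric; with only a handful of vocoders, such a coefficient should be read together with a significance test.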

Similar resources

Speech coding using trajectory compression and multiple sensors

This paper presents a new method of multi-frame speech coding based upon polynomial approximation of speech feature trajectories incorporating multiple sensor signals from microphones, accelerometer, electro-glottograph, and microradar. The trajectory polynomial approximation exploits the inter-frame information redundancy encountered in natural speech. The trajectory method is applicable to fe...

Statistical Voice Conversion with WaveNet-Based Waveform Generation

This paper presents a statistical voice conversion (VC) technique with the WaveNet-based waveform generation. VC based on a Gaussian mixture model (GMM) makes it possible to convert the speaker identity of a source speaker into that of a target speaker. However, in the conventional vocoding process, various factors such as F0 extraction errors, parameterization errors and over-smoothing effects...

Using FFI Interpolator and VQ Quantization for Designing of High Quality 1200 BPS Speech Vocoder

Storing or transmitting speech signals at very low bit rates is an active area of speech processing. We used stochastic inter-frame interpolators and vector quantization (VQ) as a new method for developing a high-quality 1200 BPS speech vocoder. The objective and subjective test results show that the performance of the new vocoder is comparable with that of standard 4800 BPS vocoders (such as CELP).

Human Speech Production Based on a Linear Predictive Vocoder – An Interactive Tutorial

This tutorial explains the principle of the human speech production with the aid of a Linear Predictive Vocoder (LPC vocoder) and the use of interactive learning procedures. The components of the human speech organ, namely the excitation and the vocal tract parameters, are computed. The components are then fed into the synthesis part of a vocoder which finally generates a synthesised speech sig...

A Prototype Real-time Plugin Framework for the Phase Vocoder

With the dramatic increase in computing power over the last few years, computationally intensive tasks such as the phase vocoder can now be performed faster than real-time. This paper presents details of the modifications and enhancements to the phase vocoder required to support real-time performance, and describes new implementations both as conventional command-line tools and in the form of p...

Publication date: 2013
